Overview

Dataset statistics

Number of variables: 13
Number of observations: 999
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 101.6 KiB
Average record size in memory: 104.1 B
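
The average record size is the total size in memory divided by the number of observations. A minimal check using only the figures reported above (the variable names are illustrative, and 1 KiB is taken as 1024 bytes):

```python
# Figures copied from the dataset statistics above.
total_kib = 101.6   # total size in memory, in KiB
n_rows = 999        # number of observations

# Convert KiB to bytes (1 KiB = 1024 B) and spread over the rows.
avg_record_bytes = total_kib * 1024 / n_rows
print(round(avg_record_bytes, 1))
```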

Variable types

Numeric: 6
Categorical: 6
Text: 1

Alerts

Country has constant value "" [Constant]
Category is highly overall correlated with Sub-Category [High correlation]
Discount is highly overall correlated with Profit [High correlation]
Postal Code is highly overall correlated with Region and 1 other field [High correlation]
Profit is highly overall correlated with Discount [High correlation]
Region is highly overall correlated with Postal Code and 1 other field [High correlation]
State is highly overall correlated with Postal Code and 1 other field [High correlation]
Sub-Category is highly overall correlated with Category [High correlation]
Row ID is uniformly distributed [Uniform]
Row ID has unique values [Unique]
Discount has 462 (46.2%) zeros [Zeros]

Reproduction

Analysis started: 2024-02-14 08:31:45.796750
Analysis finished: 2024-02-14 08:31:52.538966
Duration: 6.74 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
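
A report of this kind can be regenerated with ydata-profiling along the following lines. This is a sketch under assumptions: the source data is taken to be a CSV file, and the function name, file names and report title are hypothetical because the report does not state them. The imports sit inside the function so that merely defining it does not require pandas or ydata-profiling to be installed.

```python
def build_profile(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Lazy imports: both packages are needed only when the function is called.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Profiling Report")  # title is a placeholder
    report.to_file(out_path)


# Example call (hypothetical file name):
# build_profile("orders.csv")
```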

Variables

Row ID
Real number (ℝ)

UNIFORM  UNIQUE 

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 500
Minimum: 1
Maximum: 999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 50.9
Q1: 250.5
Median: 500
Q3: 749.5
95th percentile: 949.1
Maximum: 999
Range: 998
Interquartile range (IQR): 499

Descriptive statistics

Standard deviation: 288.53076
Coefficient of variation (CV): 0.57706152
Kurtosis: -1.2
Mean: 500
Median Absolute Deviation (MAD): 250
Skewness: 0
Sum: 499500
Variance: 83250
Monotonicity: Strictly increasing
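
Row ID has 999 distinct values between 1 and 999 and is strictly increasing, so it must be the sequence 1, 2, ..., 999. That makes the figures above easy to re-derive. The sketch below assumes the sample variance (dividing by n - 1) and percentiles by linear interpolation; both assumptions reproduce the reported numbers, but the report itself does not state which conventions were used.

```python
import math

values = list(range(1, 1000))   # Row ID: 1..999 with no gaps
n = len(values)

total = sum(values)
mean = total / n
# Sample variance (n - 1 in the denominator); an assumption that matches 83250.
variance = sum((v - mean) ** 2 for v in values) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                 # coefficient of variation


def percentile(sorted_vals, q):
    """q-th percentile by linear interpolation between neighbouring ranks."""
    pos = q / 100 * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


p5, q1, q3, p95 = (percentile(values, q) for q in (5, 25, 75, 95))
iqr = q3 - q1
print(total, mean, variance, round(std, 5), p5, q1, q3, p95, iqr)
```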
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.1%
672                 1      0.1%
659                 1      0.1%
660                 1      0.1%
661                 1      0.1%
662                 1      0.1%
663                 1      0.1%
664                 1      0.1%
665                 1      0.1%
666                 1      0.1%
Other values (989)  989    99.0%

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Value  Count  Frequency (%)
999    1      0.1%
998    1      0.1%
997    1      0.1%
996    1      0.1%
995    1      0.1%
994    1      0.1%
993    1      0.1%
992    1      0.1%
991    1      0.1%
990    1      0.1%

Segment
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Consumer: 546
Corporate: 283
Home Office: 170

Length

Max length: 11
Median length: 8
Mean length: 8.7937938
Min length: 8

Characters and Unicode

Total characters: 8785
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Consumer
2nd row: Consumer
3rd row: Corporate
4th row: Consumer
5th row: Consumer

Common Values

Value        Count  Frequency (%)
Consumer     546    54.7%
Corporate    283    28.3%
Home Office  170    17.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value      Count  Frequency (%)
consumer   546    46.7%
corporate  283    24.2%
home       170    14.5%
office     170    14.5%

Most occurring characters

Value             Count  Frequency (%)
o                 1282   14.6%
e                 1169   13.3%
r                 1112   12.7%
C                 829    9.4%
m                 716    8.2%
n                 546    6.2%
s                 546    6.2%
u                 546    6.2%
f                 340    3.9%
t                 283    3.2%
Other values (7)  1416   16.1%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  7446   84.8%
Uppercase Letter  1169   13.3%
Space Separator   170    1.9%
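
The character and category totals above follow directly from the three Segment labels and their row counts. A minimal sketch using Python's standard `unicodedata` module (this is an illustration, not how the report itself was computed; `Ll`, `Lu` and `Zs` are the Unicode general-category codes for lowercase letter, uppercase letter and space separator):

```python
from collections import Counter
import unicodedata

# Category counts copied from the Segment summary above.
segment_counts = {"Consumer": 546, "Corporate": 283, "Home Office": 170}

# Each character of a label occurs once per row carrying that label.
char_counts = Counter()
for label, rows in segment_counts.items():
    for ch in label:
        char_counts[ch] += rows

# Aggregate the character counts by Unicode general category.
category_counts = Counter()
for ch, count in char_counts.items():
    category_counts[unicodedata.category(ch)] += count

total_chars = sum(char_counts.values())
print(total_chars, len(char_counts), dict(category_counts))
```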

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1282
17.2%
e 1169
15.7%
r 1112
14.9%
m 716
9.6%
n 546
7.3%
s 546
7.3%
u 546
7.3%
f 340
 
4.6%
t 283
 
3.8%
p 283
 
3.8%
Other values (3) 623
8.4%
Uppercase Letter
ValueCountFrequency (%)
C 829
70.9%
H 170
 
14.5%
O 170
 
14.5%
Space Separator
ValueCountFrequency (%)
170
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8615
98.1%
Common 170
 
1.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 1282
14.9%
e 1169
13.6%
r 1112
12.9%
C 829
9.6%
m 716
8.3%
n 546
6.3%
s 546
6.3%
u 546
6.3%
f 340
 
3.9%
t 283
 
3.3%
Other values (6) 1246
14.5%
Common
ValueCountFrequency (%)
170
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8785
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 1282
14.6%
e 1169
13.3%
r 1112
12.7%
C 829
9.4%
m 716
8.2%
n 546
 
6.2%
s 546
 
6.2%
u 546
 
6.2%
f 340
 
3.9%
t 283
 
3.2%
Other values (7) 1416
16.1%

Country
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
United States: 999

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 12987
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: United States
2nd row: United States
3rd row: United States
4th row: United States
5th row: United States

Common Values

Value          Count  Frequency (%)
United States  999    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
united  999    50.0%
states  999    50.0%

Most occurring characters

ValueCountFrequency (%)
t 2997
23.1%
e 1998
15.4%
U 999
 
7.7%
n 999
 
7.7%
i 999
 
7.7%
d 999
 
7.7%
999
 
7.7%
S 999
 
7.7%
a 999
 
7.7%
s 999
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9990
76.9%
Uppercase Letter 1998
 
15.4%
Space Separator 999
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2997
30.0%
e 1998
20.0%
n 999
 
10.0%
i 999
 
10.0%
d 999
 
10.0%
a 999
 
10.0%
s 999
 
10.0%
Uppercase Letter
ValueCountFrequency (%)
U 999
50.0%
S 999
50.0%
Space Separator
ValueCountFrequency (%)
999
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11988
92.3%
Common 999
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 2997
25.0%
e 1998
16.7%
U 999
 
8.3%
n 999
 
8.3%
i 999
 
8.3%
d 999
 
8.3%
S 999
 
8.3%
a 999
 
8.3%
s 999
 
8.3%
Common
ValueCountFrequency (%)
999
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12987
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 2997
23.1%
e 1998
15.4%
U 999
 
7.7%
n 999
 
7.7%
i 999
 
7.7%
d 999
 
7.7%
999
 
7.7%
S 999
 
7.7%
a 999
 
7.7%
s 999
 
7.7%

City
Text

Distinct: 174
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 16
Median length: 13
Mean length: 9.3373373
Min length: 4

Characters and Unicode

Total characters: 9328
Distinct characters: 48
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 4.5%

Sample

1st row: Henderson
2nd row: Henderson
3rd row: Los Angeles
4th row: Fort Lauderdale
5th row: Fort Lauderdale
Value               Count  Frequency (%)
new                 109    7.6%
city                107    7.5%
york                100    7.0%
san                 88     6.1%
philadelphia        71     4.9%
los                 62     4.3%
angeles             62     4.3%
francisco           60     4.2%
chicago             35     2.4%
houston             30     2.1%
Other values (190)  712    49.6%
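
The word table above appears to count lower-cased, whitespace-separated tokens rather than whole city names (for example "new", "york" and "city" are listed separately). That reading is an assumption based on the table's contents. A minimal sketch of such a tokenisation, applied to the five sample rows listed earlier:

```python
from collections import Counter

# The five sample City values shown above.
cities = [
    "Henderson",
    "Henderson",
    "Los Angeles",
    "Fort Lauderdale",
    "Fort Lauderdale",
]

# Lower-case each name and split on whitespace, then count the tokens.
word_counts = Counter(
    token
    for city in cities
    for token in city.lower().split()
)
print(word_counts)
```

Under this scheme a multi-word city contributes one count per word, which is why the word totals exceed the number of rows.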

Most occurring characters

ValueCountFrequency (%)
e 840
 
9.0%
a 763
 
8.2%
o 731
 
7.8%
i 664
 
7.1%
n 630
 
6.8%
l 564
 
6.0%
r 496
 
5.3%
s 455
 
4.9%
437
 
4.7%
t 412
 
4.4%
Other values (38) 3336
35.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7455
79.9%
Uppercase Letter 1436
 
15.4%
Space Separator 437
 
4.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 840
11.3%
a 763
10.2%
o 731
9.8%
i 664
8.9%
n 630
 
8.5%
l 564
 
7.6%
r 496
 
6.7%
s 455
 
6.1%
t 412
 
5.5%
c 269
 
3.6%
Other values (14) 1631
21.9%
Uppercase Letter
ValueCountFrequency (%)
C 220
15.3%
S 152
10.6%
A 123
8.6%
N 120
8.4%
P 111
 
7.7%
L 106
 
7.4%
Y 100
 
7.0%
F 88
 
6.1%
D 80
 
5.6%
M 53
 
3.7%
Other values (13) 283
19.7%
Space Separator
ValueCountFrequency (%)
437
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8891
95.3%
Common 437
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 840
 
9.4%
a 763
 
8.6%
o 731
 
8.2%
i 664
 
7.5%
n 630
 
7.1%
l 564
 
6.3%
r 496
 
5.6%
s 455
 
5.1%
t 412
 
4.6%
c 269
 
3.0%
Other values (37) 3067
34.5%
Common
ValueCountFrequency (%)
437
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9328
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 840
 
9.0%
a 763
 
8.2%
o 731
 
7.8%
i 664
 
7.1%
n 630
 
6.8%
l 564
 
6.0%
r 496
 
5.3%
s 455
 
4.9%
437
 
4.7%
t 412
 
4.4%
Other values (38) 3336
35.8%

State
Categorical

HIGH CORRELATION 

Distinct: 40
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
California: 183
New York: 135
Texas: 92
Pennsylvania: 79
Ohio: 54
Other values (35): 456

Length

Max length: 14
Median length: 13
Mean length: 8.4394394
Min length: 4

Characters and Unicode

Total characters: 8431
Distinct characters: 46
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: Kentucky
2nd row: Kentucky
3rd row: California
4th row: Florida
5th row: Florida

Common Values

Value              Count  Frequency (%)
California         183    18.3%
New York           135    13.5%
Texas              92     9.2%
Pennsylvania       79     7.9%
Ohio               54     5.4%
Illinois           52     5.2%
Florida            42     4.2%
Michigan           42     4.2%
Washington         39     3.9%
Colorado           35     3.5%
Other values (30)  246    24.6%

Length

Histogram of lengths of the category
Value              Count  Frequency (%)
california         183    15.5%
new                157    13.3%
york               135    11.4%
texas              92     7.8%
pennsylvania       79     6.7%
ohio               54     4.6%
illinois           52     4.4%
florida            42     3.6%
michigan           42     3.6%
washington         39     3.3%
Other values (33)  307    26.0%

Most occurring characters

ValueCountFrequency (%)
a 1023
12.1%
i 988
11.7%
n 819
 
9.7%
o 766
 
9.1%
r 537
 
6.4%
e 498
 
5.9%
l 485
 
5.8%
s 424
 
5.0%
C 248
 
2.9%
N 184
 
2.2%
Other values (36) 2459
29.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7066
83.8%
Uppercase Letter 1182
 
14.0%
Space Separator 183
 
2.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1023
14.5%
i 988
14.0%
n 819
11.6%
o 766
10.8%
r 537
7.6%
e 498
7.0%
l 485
6.9%
s 424
 
6.0%
f 183
 
2.6%
h 181
 
2.6%
Other values (14) 1162
16.4%
Uppercase Letter
ValueCountFrequency (%)
C 248
21.0%
N 184
15.6%
Y 135
11.4%
T 102
8.6%
M 87
 
7.4%
P 79
 
6.7%
I 72
 
6.1%
O 66
 
5.6%
W 49
 
4.1%
F 42
 
3.6%
Other values (11) 118
10.0%
Space Separator
ValueCountFrequency (%)
183
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8248
97.8%
Common 183
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1023
12.4%
i 988
12.0%
n 819
 
9.9%
o 766
 
9.3%
r 537
 
6.5%
e 498
 
6.0%
l 485
 
5.9%
s 424
 
5.1%
C 248
 
3.0%
N 184
 
2.2%
Other values (35) 2276
27.6%
Common
ValueCountFrequency (%)
183
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8431
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1023
12.1%
i 988
11.7%
n 819
 
9.7%
o 766
 
9.1%
r 537
 
6.4%
e 498
 
5.9%
l 485
 
5.8%
s 424
 
5.0%
C 248
 
2.9%
N 184
 
2.2%
Other values (36) 2459
29.2%

Postal Code
Real number (ℝ)

HIGH CORRELATION 

Distinct: 221
Distinct (%): 22.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53511.934
Minimum: 1841
Maximum: 98661
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 1841
5th percentile: 10009
Q1: 19474
Median: 50315
Q3: 85023
95th percentile: 95661
Maximum: 98661
Range: 96820
Interquartile range (IQR): 65549

Descriptive statistics

Standard deviation: 31507.965
Coefficient of variation (CV): 0.58880259
Kurtosis: -1.470991
Mean: 53511.934
Median Absolute Deviation (MAD): 31172
Skewness: -0.058412222
Sum: 53458422
Variance: 9.9275189 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
94110               32     3.2%
10024               29     2.9%
10035               28     2.8%
10009               28     2.8%
19140               28     2.8%
43229               19     1.9%
60610               17     1.7%
94122               17     1.7%
90032               16     1.6%
19134               15     1.5%
Other values (211)  770    77.1%

Value  Count  Frequency (%)
1841   3      0.3%
1852   4      0.4%
2038   1      0.1%
2886   2      0.2%
3301   4      0.4%
6040   4      0.4%
6360   1      0.1%
6824   1      0.1%
7036   1      0.1%
7060   1      0.1%

Value  Count  Frequency (%)
98661  3      0.3%
98270  1      0.1%
98198  2      0.2%
98115  13     1.3%
98105  13     1.3%
98103  3      0.3%
98026  4      0.4%
97301  4      0.4%
97206  5      0.5%
95661  7      0.7%
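
Postal Code was inferred as a real number, so the smallest values shown above have only four digits. If these are five-digit US ZIP codes (an assumption; the report does not say so), the leading zero was most likely dropped when the column was read as numeric, and the mean or variance of the column carries little meaning. A small sketch of restoring the five-digit form, with `format_zip` as a hypothetical helper name:

```python
def format_zip(code: float) -> str:
    """Render a numeric postal code as a zero-padded five-digit string."""
    return f"{int(code):05d}"


# The five smallest Postal Code values from the table above.
smallest = [1841, 1852, 2038, 2886, 3301]
print([format_zip(c) for c in smallest])
# → ['01841', '01852', '02038', '02886', '03301']
```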

Region
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
East: 313
West: 301
Central: 245
South: 140

Length

Max length: 7
Median length: 4
Mean length: 4.8758759
Min length: 4

Characters and Unicode

Total characters: 4871
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: South
2nd row: South
3rd row: West
4th row: South
5th row: South

Common Values

Value    Count  Frequency (%)
East     313    31.3%
West     301    30.1%
Central  245    24.5%
South    140    14.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count  Frequency (%)
east     313    31.3%
west     301    30.1%
central  245    24.5%
south    140    14.0%

Most occurring characters

ValueCountFrequency (%)
t 999
20.5%
s 614
12.6%
a 558
11.5%
e 546
11.2%
E 313
 
6.4%
W 301
 
6.2%
C 245
 
5.0%
n 245
 
5.0%
r 245
 
5.0%
l 245
 
5.0%
Other values (4) 560
11.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3872
79.5%
Uppercase Letter 999
 
20.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 999
25.8%
s 614
15.9%
a 558
14.4%
e 546
14.1%
n 245
 
6.3%
r 245
 
6.3%
l 245
 
6.3%
o 140
 
3.6%
u 140
 
3.6%
h 140
 
3.6%
Uppercase Letter
ValueCountFrequency (%)
E 313
31.3%
W 301
30.1%
C 245
24.5%
S 140
14.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4871
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 999
20.5%
s 614
12.6%
a 558
11.5%
e 546
11.2%
E 313
 
6.4%
W 301
 
6.2%
C 245
 
5.0%
n 245
 
5.0%
r 245
 
5.0%
l 245
 
5.0%
Other values (4) 560
11.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4871
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 999
20.5%
s 614
12.6%
a 558
11.5%
e 546
11.2%
E 313
 
6.4%
W 301
 
6.2%
C 245
 
5.0%
n 245
 
5.0%
r 245
 
5.0%
l 245
 
5.0%
Other values (4) 560
11.5%

Category
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Office Supplies: 603
Furniture: 206
Technology: 190

Length

Max length: 15
Median length: 15
Mean length: 12.811812
Min length: 9

Characters and Unicode

Total characters: 12799
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Furniture
2nd row: Furniture
3rd row: Office Supplies
4th row: Furniture
5th row: Office Supplies

Common Values

Value            Count  Frequency (%)
Office Supplies  603    60.4%
Furniture        206    20.6%
Technology       190    19.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
office      603    37.6%
supplies    603    37.6%
furniture   206    12.9%
technology  190    11.9%

Most occurring characters

ValueCountFrequency (%)
e 1602
12.5%
i 1412
11.0%
p 1206
9.4%
f 1206
9.4%
u 1015
 
7.9%
c 793
 
6.2%
l 793
 
6.2%
O 603
 
4.7%
s 603
 
4.7%
S 603
 
4.7%
Other values (10) 2963
23.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10594
82.8%
Uppercase Letter 1602
 
12.5%
Space Separator 603
 
4.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1602
15.1%
i 1412
13.3%
p 1206
11.4%
f 1206
11.4%
u 1015
9.6%
c 793
7.5%
l 793
7.5%
s 603
 
5.7%
r 412
 
3.9%
n 396
 
3.7%
Other values (5) 1156
10.9%
Uppercase Letter
ValueCountFrequency (%)
O 603
37.6%
S 603
37.6%
F 206
 
12.9%
T 190
 
11.9%
Space Separator
ValueCountFrequency (%)
603
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12196
95.3%
Common 603
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1602
13.1%
i 1412
11.6%
p 1206
9.9%
f 1206
9.9%
u 1015
8.3%
c 793
 
6.5%
l 793
 
6.5%
O 603
 
4.9%
s 603
 
4.9%
S 603
 
4.9%
Other values (9) 2360
19.4%
Common
ValueCountFrequency (%)
603
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12799
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1602
12.5%
i 1412
11.0%
p 1206
9.4%
f 1206
9.4%
u 1015
 
7.9%
c 793
 
6.2%
l 793
 
6.2%
O 603
 
4.7%
s 603
 
4.7%
S 603
 
4.7%
Other values (10) 2963
23.2%

Sub-Category
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Binders: 145
Paper: 127
Storage: 95
Furnishings: 94
Phones: 88
Other values (12): 450

Length

Max length: 11
Median length: 9
Mean length: 7.2162162
Min length: 3

Characters and Unicode

Total characters: 7209
Distinct characters: 28
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Bookcases
2nd row: Chairs
3rd row: Labels
4th row: Tables
5th row: Storage

Common Values

Value             Count  Frequency (%)
Binders           145    14.5%
Paper             127    12.7%
Storage           95     9.5%
Furnishings       94     9.4%
Phones            88     8.8%
Accessories       83     8.3%
Art               81     8.1%
Chairs            58     5.8%
Appliances        43     4.3%
Labels            41     4.1%
Other values (7)  144    14.4%

Length

Histogram of lengths of the category
Value             Count  Frequency (%)
binders           145    14.5%
paper             127    12.7%
storage           95     9.5%
furnishings       94     9.4%
phones            88     8.8%
accessories       83     8.3%
art               81     8.1%
chairs            58     5.8%
appliances        43     4.3%
labels            41     4.1%
Other values (7)  144    14.4%

Most occurring characters

ValueCountFrequency (%)
s 998
13.8%
e 900
12.5%
r 710
 
9.8%
i 556
 
7.7%
n 528
 
7.3%
a 452
 
6.3%
o 344
 
4.8%
p 289
 
4.0%
h 253
 
3.5%
c 243
 
3.4%
Other values (18) 1936
26.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6210
86.1%
Uppercase Letter 999
 
13.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 998
16.1%
e 900
14.5%
r 710
11.4%
i 556
9.0%
n 528
8.5%
a 452
7.3%
o 344
 
5.5%
p 289
 
4.7%
h 253
 
4.1%
c 243
 
3.9%
Other values (8) 937
15.1%
Uppercase Letter
ValueCountFrequency (%)
P 215
21.5%
A 207
20.7%
B 166
16.6%
F 115
11.5%
S 115
11.5%
C 64
 
6.4%
L 41
 
4.1%
T 33
 
3.3%
E 30
 
3.0%
M 13
 
1.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 7209
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 998
13.8%
e 900
12.5%
r 710
 
9.8%
i 556
 
7.7%
n 528
 
7.3%
a 452
 
6.3%
o 344
 
4.8%
p 289
 
4.0%
h 253
 
3.5%
c 243
 
3.4%
Other values (18) 1936
26.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7209
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 998
13.8%
e 900
12.5%
r 710
 
9.8%
i 556
 
7.7%
n 528
 
7.3%
a 452
 
6.3%
o 344
 
4.8%
p 289
 
4.0%
h 253
 
3.5%
c 243
 
3.4%
Other values (18) 1936
26.9%

Sales
Real number (ℝ)

Distinct: 907
Distinct (%): 90.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 241.339
Minimum: 1.08
Maximum: 8159.952
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 1.08
5th percentile: 4.8822
Q1: 18.364
Median: 55.98
Q3: 213.2975
95th percentile: 1043.991
Maximum: 8159.952
Range: 8158.872
Interquartile range (IQR): 194.9335

Descriptive statistics

Standard deviation: 596.14063
Coefficient of variation (CV): 2.470138
Kurtosis: 76.85455
Mean: 241.339
Median Absolute Deviation (MAD): 46.38
Skewness: 7.3824622
Sum: 241097.66
Variance: 355383.65
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
12.96               8      0.8%
15.552              6      0.6%
14.62               4      0.4%
32.4                4      0.4%
24.56               4      0.4%
8.82                3      0.3%
18.28               3      0.3%
242.94              3      0.3%
1199.976            3      0.3%
11.52               3      0.3%
Other values (897)  958    95.9%

Value  Count  Frequency (%)
1.08   1      0.1%
1.112  1      0.1%
1.248  1      0.1%
1.624  1      0.1%
1.68   1      0.1%
1.788  1      0.1%
2.08   1      0.1%
2.2    1      0.1%
2.308  1      0.1%
2.376  1      0.1%

Value     Count  Frequency (%)
8159.952  1      0.1%
7999.98   1      0.1%
6354.95   1      0.1%
4355.168  1      0.1%
3991.98   1      0.1%
3347.37   1      0.1%
3083.43   1      0.1%
3059.982  2      0.2%
2999.95   1      0.1%
2735.952  1      0.1%

Quantity
Real number (ℝ)

Distinct: 14
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8008008
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 3
Q3: 5
95th percentile: 8
Maximum: 14
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.2688443
Coefficient of variation (CV): 0.5969385
Kurtosis: 2.1677722
Mean: 3.8008008
Median Absolute Deviation (MAD): 1
Skewness: 1.328083
Sum: 3797
Variance: 5.1476547
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Value             Count  Frequency (%)
3                 248    24.8%
2                 240    24.0%
4                 122    12.2%
5                 102    10.2%
1                 90     9.0%
7                 67     6.7%
6                 58     5.8%
8                 32     3.2%
9                 21     2.1%
10                7      0.7%
Other values (4)  12     1.2%

Value  Count  Frequency (%)
1      90     9.0%
2      240    24.0%
3      248    24.8%
4      122    12.2%
5      102    10.2%
6      58     5.8%
7      67     6.7%
8      32     3.2%
9      21     2.1%
10     7      0.7%

Value  Count  Frequency (%)
14     4      0.4%
13     3      0.3%
12     3      0.3%
11     2      0.2%
10     7      0.7%
9      21     2.1%
8      32     3.2%
7      67     6.7%
6      58     5.8%
5      102    10.2%

Discount
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 12
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.16284284
Minimum: 0
Maximum: 0.8
Zeros: 462
Zeros (%): 46.2%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0.2
Q3: 0.2
95th percentile: 0.7
Maximum: 0.8
Range: 0.8
Interquartile range (IQR): 0.2

Descriptive statistics

Standard deviation: 0.2084592
Coefficient of variation (CV): 1.280125
Kurtosis: 2.0549294
Mean: 0.16284284
Median Absolute Deviation (MAD): 0.2
Skewness: 1.5959389
Sum: 162.68
Variance: 0.043455237
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value             Count  Frequency (%)
0                 462    46.2%
0.2               380    38.0%
0.7               48     4.8%
0.8               27     2.7%
0.4               21     2.1%
0.3               15     1.5%
0.5               13     1.3%
0.6               12     1.2%
0.1               9      0.9%
0.45              5      0.5%
Other values (2)  7      0.7%

Value  Count  Frequency (%)
0      462    46.2%
0.1    9      0.9%
0.15   3      0.3%
0.2    380    38.0%
0.3    15     1.5%
0.32   4      0.4%
0.4    21     2.1%
0.45   5      0.5%
0.5    13     1.3%
0.6    12     1.2%

Value  Count  Frequency (%)
0.8    27     2.7%
0.7    48     4.8%
0.6    12     1.2%
0.5    13     1.3%
0.45   5      0.5%
0.4    21     2.1%
0.32   4      0.4%
0.3    15     1.5%
0.2    380    38.0%
0.15   3      0.3%
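
Discount has only 12 distinct values, and the tables above list all of them with their counts, so the headline figures (sum, mean, median, share of zeros) can be recomputed from the frequency table alone. A minimal sketch; `weighted_median` is an illustrative helper, not part of any library:

```python
# Value -> count, copied from the Discount tables above (all 12 distinct values).
discount_counts = {
    0.0: 462, 0.1: 9, 0.15: 3, 0.2: 380, 0.3: 15, 0.32: 4,
    0.4: 21, 0.45: 5, 0.5: 13, 0.6: 12, 0.7: 48, 0.8: 27,
}

n = sum(discount_counts.values())
total = sum(value * count for value, count in discount_counts.items())
mean = total / n
zeros_pct = 100 * discount_counts[0.0] / n


def weighted_median(counts):
    """Median of the expanded data for an odd number of observations."""
    middle = (sum(counts.values()) + 1) // 2   # rank of the middle observation
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running >= middle:
            return value


median = weighted_median(discount_counts)
print(n, round(total, 2), round(mean, 8), round(zeros_pct, 1), median)
```

Nearly half of the rows carry no discount, which is what triggers the Zeros alert.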

Profit
Real number (ℝ)

HIGH CORRELATION 

Distinct: 940
Distinct (%): 94.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.752018
Minimum: -3839.9904
Maximum: 3177.475
Zeros: 6
Zeros (%): 0.6%
Negative: 186
Negative (%): 18.6%
Memory size: 7.9 KiB

Quantile statistics

Minimum: -3839.9904
5th percentile: -102.8922
Q1: 1.80245
Median: 8.694
Q3: 27.3175
95th percentile: 140.94495
Maximum: 3177.475
Range: 7017.4654
Interquartile range (IQR): 25.51505

Descriptive statistics

Standard deviation: 225.7271
Coefficient of variation (CV): 12.037483
Kurtosis: 137.46563
Mean: 18.752018
Median Absolute Deviation (MAD): 10.5248
Skewness: -1.6471439
Sum: 18733.266
Variance: 50952.724
Monotonicity: Not monotonic
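
Several of the Profit figures are derived from others in the same summary: the range from the extremes, the IQR from the quartiles, and the coefficient of variation from the standard deviation and the mean. A short check using the reported values (inputs are rounded as printed, so the last digits can differ slightly):

```python
# Figures copied from the Profit summary above.
minimum, maximum = -3839.9904, 3177.475
q1, q3 = 1.80245, 27.3175
mean, std = 18.752018, 225.7271

value_range = maximum - minimum   # spread between the two extremes
iqr = q3 - q1                     # spread of the middle 50% of rows
cv = std / mean                   # dispersion relative to the mean

print(round(value_range, 4), round(iqr, 5), round(cv, 4))
```

The CV of about 12 reflects a mean that is small compared with the spread, so a few extreme gains and losses dominate the dispersion.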
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
6.2208              7      0.7%
0                   6      0.6%
5.4432              5      0.5%
6.8714              4      0.4%
15.552              4      0.4%
9.936               3      0.3%
2.9568              2      0.2%
2.8536              2      0.2%
6.8768              2      0.2%
26.6304             2      0.2%
Other values (930)  962    96.3%

Value       Count  Frequency (%)
-3839.9904  1      0.1%
-1665.0522  1      0.1%
-1359.992   1      0.1%
-950.4      1      0.1%
-814.4832   1      0.1%
-760.98     1      0.1%
-619.596    1      0.1%
-509.997    1      0.1%
-453.849    1      0.1%
-407.976    1      0.1%

Value      Count  Frequency (%)
3177.475   1      0.1%
1995.99    1      0.1%
1415.4296  1      0.1%
1379.977   1      0.1%
1276.4871  1      0.1%
829.3754   1      0.1%
701.9883   1      0.1%
679.996    1      0.1%
636.0003   1      0.1%
601.9699   1      0.1%

Interactions

[Figures: 36 pairwise interaction plots, one for each ordered pair of the 6 numeric variables; images not reproduced in this text version.]

Correlations

              Category  Discount  Postal Code  Profit  Quantity  Region  Row ID  Sales   Segment  State   Sub-Category
Category       1.000    -0.058     0.025        0.142  -0.003     0.000   0.014  -0.003   0.000    0.049   0.993
Discount      -0.058     1.000     0.052       -0.515   0.034     0.261  -0.068   0.013   0.000    0.322   0.364
Postal Code    0.025     0.052     1.000        0.013   0.000     0.924  -0.087   0.031   0.136    0.955   0.064
Profit         0.142    -0.515     0.013        1.000   0.257     0.062   0.074   0.470   0.041    0.104   0.277
Quantity      -0.003     0.034     0.000        0.257   1.000     0.032  -0.058   0.374   0.000    0.000   0.000
Region         0.000     0.261     0.924        0.062   0.032     1.000   0.032   0.043   0.079    0.982   0.104
Row ID         0.014    -0.068    -0.087        0.074  -0.058     0.032   1.000  -0.024   0.120    0.230   0.000
Sales         -0.003     0.013     0.031        0.470   0.374     0.043  -0.024   1.000   0.000    0.034   0.246
Segment        0.000     0.000     0.136        0.041   0.000     0.079   0.120   0.000   1.000    0.243   0.000
State          0.049     0.322     0.955        0.104   0.000     0.982   0.230   0.034   0.243    1.000   0.031
Sub-Category   0.993     0.364     0.064        0.277   0.000     0.104   0.000   0.246   0.000    0.031   1.000
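
The High correlation alerts in the overview are consistent with flagging every pair whose absolute correlation reaches 0.5. That threshold is an assumption here, since the report does not state it. The sketch below applies it to the entries of the correlation table with an absolute value of at least 0.2 (smaller entries are left out for brevity):

```python
THRESHOLD = 0.5  # assumed cut-off; not stated in the report

# Upper-triangle entries copied from the correlation table above.
correlations = {
    ("Category", "Sub-Category"): 0.993,
    ("Discount", "Profit"): -0.515,
    ("Discount", "Region"): 0.261,
    ("Discount", "State"): 0.322,
    ("Discount", "Sub-Category"): 0.364,
    ("Postal Code", "Region"): 0.924,
    ("Postal Code", "State"): 0.955,
    ("Profit", "Quantity"): 0.257,
    ("Profit", "Sales"): 0.470,
    ("Profit", "Sub-Category"): 0.277,
    ("Quantity", "Sales"): 0.374,
    ("Region", "State"): 0.982,
    ("Row ID", "State"): 0.230,
    ("Sales", "Sub-Category"): 0.246,
    ("Segment", "State"): 0.243,
}

# Keep the pairs whose absolute correlation reaches the threshold.
flagged = sorted(pair for pair, r in correlations.items() if abs(r) >= THRESHOLD)
for a, b in flagged:
    print(f"{a} <-> {b}: {correlations[(a, b)]:+.3f}")
```

The five flagged pairs are the same ones named in the alerts; Profit and Sales (0.470) fall just below the assumed cut-off.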

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     Row ID  Segment      Country        City             State       Postal Code  Region  Category         Sub-Category  Sales      Quantity  Discount  Profit
0    1       Consumer     United States  Henderson        Kentucky    42420        South   Furniture        Bookcases     261.9600   2         0.00      41.9136
1    2       Consumer     United States  Henderson        Kentucky    42420        South   Furniture        Chairs        731.9400   3         0.00      219.5820
2    3       Corporate    United States  Los Angeles      California  90036        West    Office Supplies  Labels        14.6200    2         0.00      6.8714
3    4       Consumer     United States  Fort Lauderdale  Florida     33311        South   Furniture        Tables        957.5775   5         0.45      -383.0310
4    5       Consumer     United States  Fort Lauderdale  Florida     33311        South   Office Supplies  Storage       22.3680    2         0.20      2.5164
5    6       Consumer     United States  Los Angeles      California  90032        West    Furniture        Furnishings   48.8600    7         0.00      14.1694
6    7       Consumer     United States  Los Angeles      California  90032        West    Office Supplies  Art           7.2800     4         0.00      1.9656
7    8       Consumer     United States  Los Angeles      California  90032        West    Technology       Phones        907.1520   6         0.20      90.7152
8    9       Consumer     United States  Los Angeles      California  90032        West    Office Supplies  Binders       18.5040    3         0.20      5.7825
9    10      Consumer     United States  Los Angeles      California  90032        West    Office Supplies  Appliances    114.9000   5         0.00      34.4700

Last rows

     Row ID  Segment      Country        City             State       Postal Code  Region  Category         Sub-Category  Sales      Quantity  Discount  Profit
989  990     Corporate    United States  Auburn           New York    13021        East    Office Supplies  Art           17.970     3         0.0       5.2113
990  991     Home Office  United States  Jacksonville     Florida     32216        South   Furniture        Chairs        1166.920   5         0.2       131.2785
991  992     Consumer     United States  New York City    New York    10024        East    Office Supplies  Binders       14.624     2         0.2       5.4840
992  993     Consumer     United States  San Jose         California  95123        West    Office Supplies  Fasteners     10.230     3         0.0       4.9104
993  994     Consumer     United States  San Jose         California  95123        West    Office Supplies  Paper         154.900    5         0.0       69.7050
994  995     Corporate    United States  Virginia Beach   Virginia    23464        South   Office Supplies  Binders       2715.930   7         0.0       1276.4871
995  996     Corporate    United States  Virginia Beach   Virginia    23464        South   Technology       Phones        617.970    3         0.0       173.0316
996  997     Consumer     United States  Henderson        Kentucky    42420        South   Office Supplies  Envelopes     10.670     1         0.0       4.9082
997  998     Consumer     United States  Henderson        Kentucky    42420        South   Office Supplies  Storage       36.630     3         0.0       9.8901
998  999     Consumer     United States  Henderson        Kentucky    42420        South   Furniture        Furnishings   24.100     5         0.0       9.1580